Analysis of a bias effect in a tree-based variable importance measure

Authors

  • Marco F. Sandri
  • Paola Zuccolotto
Abstract

Data mining research has widely addressed the problem of variable selection, and several variable importance measures have been proposed in the literature. This paper deals with a frequently used variable importance measure defined for decision trees and tree-based ensemble models such as Random Forests and Treeboost. It aims to show that this importance measure is affected by a bias and to discuss the potentially harmful consequences of that bias for variable selection. A heuristic correction strategy is also proposed.
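The abstract does not spell out how the bias arises, so the following is only a minimal sketch of one commonly discussed mechanism, not the paper's own analysis: when importance is measured by impurity reduction, a covariate that offers more candidate split points tends to receive a larger best-split gain even when it is pure noise. The Gini impurity criterion, the simulation settings, and all function names (`gini`, `best_split_gain`, `mean_gains`) are assumptions introduced here for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split_gain(x, y):
    """Largest Gini impurity decrease over all thresholds on covariate x."""
    n = len(y)
    parent = gini(y)
    pairs = sorted(zip(x, y))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def mean_gains(n_obs=100, n_rep=200, seed=0):
    """Average best-split gain of two pure-noise covariates (hypothetical setup)."""
    rng = random.Random(seed)
    total_binary = 0.0
    total_continuous = 0.0
    for _ in range(n_rep):
        # The response is independent of both covariates by construction.
        y = [rng.randint(0, 1) for _ in range(n_obs)]
        x_binary = [rng.randint(0, 1) for _ in range(n_obs)]  # at most one split
        x_continuous = [rng.random() for _ in range(n_obs)]   # up to n_obs - 1 splits
        total_binary += best_split_gain(x_binary, y)
        total_continuous += best_split_gain(x_continuous, y)
    return total_binary / n_rep, total_continuous / n_rep


binary_gain, continuous_gain = mean_gains()
print(f"binary noise: {binary_gain:.4f}, continuous noise: {continuous_gain:.4f}")
```

Neither covariate carries any information about the response, yet the continuous one should show the larger average gain simply because the search over thresholds has more chances to find a spurious split. A selection rule based on this raw gain would therefore tend to prefer it.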





Publication date: 2006